Welcome to the July 2023 report from the
Reproducible Builds project. In our reports, we try to outline the most important things that we have been up to over the past month. As ever, if you are interested in contributing to the project, please visit the
Contribute page on our website.
Marcel Fourné et al. presented at the
IEEE Symposium on Security and Privacy in San Francisco, CA on
The Importance and Challenges of Reproducible Builds for Software Supply Chain Security.
As summarised in
last month's report, the abstract of their paper begins:
The 2020 Solarwinds attack was a tipping point that caused a heightened awareness about the security of the software supply chain and in particular the large amount of trust placed in build systems. Reproducible Builds (R-Bs) provide a strong foundation to build defenses for arbitrary attacks against build systems by ensuring that given the same source code, build environment, and build instructions, bitwise-identical artifacts are created. (PDF)
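The property described in the abstract can be illustrated with a minimal Python sketch. It is not taken from the paper: the function names are hypothetical, and a "build" is modelled as any callable returning the artifact's bytes. Running the build more than once and comparing cryptographic digests shows whether the outputs are bitwise identical.

```python
import hashlib
import itertools


def is_reproducible(build, runs=2):
    """Run `build` several times and report whether every run
    produced a bitwise-identical artifact (compared via SHA-256)."""
    digests = {hashlib.sha256(build()).hexdigest() for _ in range(runs)}
    return len(digests) == 1


def stable_build():
    # Hypothetical build whose output depends only on its inputs.
    return b"binary contents"


_run_counter = itertools.count()


def unstable_build():
    # Hypothetical build that leaks per-run state into the artifact,
    # standing in for things like embedded timestamps or build paths.
    return b"binary contents from run %d" % next(_run_counter)
```

Under these assumptions, `is_reproducible(stable_build)` holds while `is_reproducible(unstable_build)` does not, because the second build embeds a value that changes on every run.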
Chris Lamb published an interview with Simon Butler, associate senior lecturer in the School of Informatics at the
University of Skövde, on
the business adoption of Reproducible Builds.
(This is actually the seventh instalment in a series featuring the projects, companies and individuals who support our project. We started this series by
featuring the Civil Infrastructure Platform project, and followed this up with a
post about the Ford Foundation as well as recent ones about
ARDC, the
Google Open Source Security Team (GOSST),
Bootstrappable Builds,
the F-Droid project and
David A. Wheeler.)
Vagrant Cascadian presented
Breaking the Chains of Trusting Trust at
FOSSY 2023.
Rahul Bajaj has been working with Roland Clobus on
merging an overview of environment variations to
our website:
I have identified 16 root causes for unreproducible builds in my empirical study, which I have linked to the corresponding documentation. The initial MR right now contains information about 10 root causes. For each root cause, I have provided a definition, a notable instance, and a workaround. However, I have only found workarounds for 5 out of the 10 root causes listed in this merge request. In the upcoming commits, I plan to add an additional 6 root causes. I kindly request you review the text for any necessary refinements, modifications, or corrections. Additionally, I would appreciate the help with documentation for the solutions/workarounds for the remaining root causes: Archive Metadata, Build ID, File System Ordering, File Permissions, and Snippet Encoding. Your input on the identified root causes for unreproducible builds would be greatly appreciated.
Just a reminder that our
upcoming Reproducible Builds Summit is set to take place from October 31st to November 2nd 2023 in Hamburg, Germany.
Our summits are a unique gathering that brings together attendees from diverse projects, united by a shared vision of advancing the Reproducible Builds effort. During this enriching event, participants will have the opportunity to engage in discussions, establish connections and exchange ideas to drive progress in this vital field.
If you're interested in joining us this year, please make sure to read the
event page which has more details about the event and location.
There was more progress towards making the
Go programming language ecosystem reproducible this month, including:
In addition,
kpcyrd posted to our mailing list to report that:
while packaging govulncheck for Arch Linux I noticed a checksum mismatch for a tar file I downloaded from go.googlesource.com. I used diffoscope to compare the .tar file I downloaded with the .tar file the build server downloaded, and noticed the timestamps are different.
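To see why differing timestamps alone are enough to change a checksum, here is a small Python sketch. It is not how diffoscope works internally and does not reproduce the archives from the report: the helper names, file name and timestamps are all made up for illustration, and it relies only on the standard `tarfile` module.

```python
import io
import tarfile


def make_tar(files, mtime):
    """Build an in-memory tar archive in which every member carries
    the given modification time. `files` maps names to bytes."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
        for name in sorted(files):  # fixed member order
            data = files[name]
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            info.mtime = mtime
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def mtime_differences(tar_a, tar_b):
    """Return {member name: (mtime in a, mtime in b)} for every member
    present in both archives whose modification times differ."""
    def mtimes(blob):
        with tarfile.open(fileobj=io.BytesIO(blob), mode="r") as tar:
            return {member.name: member.mtime for member in tar.getmembers()}

    a, b = mtimes(tar_a), mtimes(tar_b)
    return {name: (a[name], b[name])
            for name in sorted(a.keys() & b.keys())
            if a[name] != b[name]}
```

Two archives built from the same content but with different `mtime` values differ byte for byte, so their checksums differ too, while `mtime_differences` pinpoints the metadata responsible. Building both with the same fixed `mtime` yields identical bytes.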
In Debian, 20 reviews of Debian packages were added, 25 were updated and 25 were removed this month, adding to
our knowledge about identified issues. A number of issue types were updated, including marking ffile_prefix_map_passed_to_clang as being fixed since Debian bullseye and adding a Debian bug tracker reference for the nondeterminism_added_by_pyqt5_pyrcc5 issue.
In addition, Roland Clobus posted another
detailed update of the status of reproducible Debian ISO images on our mailing list. In particular, Roland helpfully summarised that "live images are looking good, and the number of (passing) automated tests is growing".
Bernhard M. Wiedemann published another
monthly report about reproducibility within openSUSE.
F-Droid added 20 new reproducible apps in July, making 165 apps in total that are published with Reproducible Builds and using the upstream developer's signature.
The Sphinx documentation tool recently accepted a change to improve the deterministic reproducibility of documentation. Its internal util.inspect.object_description function attempts to sort collections, but this can fail. The change handles the failure case by using string-based object descriptions as a fallback deterministic sort ordering, as well as adding recursive object-description calls for list and tuple datatypes. As a result, documentation generated by Sphinx will be more likely to be automatically reproducible.
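The general idea behind such a fallback can be sketched in a few lines of Python. This is an illustration only and not Sphinx's actual code: `describe` is a hypothetical helper, and cosmetic details such as empty sets or one-element tuples are ignored.

```python
def describe(obj):
    """Return a textual description of `obj` that does not depend on
    the iteration order of any set it contains."""
    if isinstance(obj, (set, frozenset)):
        try:
            # Natural ordering works when the elements are comparable.
            items = sorted(obj)
        except TypeError:
            # Mixed types (for example int and str) cannot be compared,
            # so fall back to ordering by each element's description.
            items = sorted(obj, key=describe)
        return "{" + ", ".join(describe(item) for item in items) + "}"
    if isinstance(obj, list):
        # Recurse so that sets nested inside lists are also stabilised.
        return "[" + ", ".join(describe(item) for item in obj) + "]"
    if isinstance(obj, tuple):
        return "(" + ", ".join(describe(item) for item in obj) + ")"
    return repr(obj)
```

For instance, `describe({1, "a", 2})` cannot sort its elements directly, so it orders them by their string descriptions and returns `{'a', 1, 2}` on every run, regardless of how the set happens to iterate.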
Lastly in news, kpcyrd posted to our mailing list announcing a new repro-env tool:
My initial interest in reproducible builds was "how do I distribute pre-compiled binaries on GitHub without people raising security concerns about them". I've cycled back to this original problem about 5 years later and built a tool that is meant to address this.
Upstream patches
The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of such patches, including:
- Bernhard M. Wiedemann:
- Chris Lamb (lamby):
- Johannes Schauer Marin Rodrigues (josch):
- John Neffenger:
  - openjdk/jfx#446 (openjfx), Enable reproducible builds with SOURCE_DATE_EPOCH, a three-and-a-half year effort started by Bernhard M. Wiedemann in January 2020, taken over by John Neffenger in March 2021, integrated upstream in June 2023, and available starting with JavaFX 21 on September 19, 2023.
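For readers unfamiliar with SOURCE_DATE_EPOCH, the convention is that a build tool embeds the timestamp given in that environment variable (seconds since the Unix epoch) instead of the current time. A minimal Python sketch of honouring it follows. It is unrelated to the JavaFX patch itself, and `build_timestamp` is a hypothetical helper name.

```python
import os
import time
from datetime import datetime, timezone


def build_timestamp(environ=None):
    """Return the timestamp a build should embed, as an aware UTC datetime.

    Uses SOURCE_DATE_EPOCH when it is set, and only falls back to the
    current time otherwise."""
    env = os.environ if environ is None else environ
    value = env.get("SOURCE_DATE_EPOCH")
    seconds = int(value) if value else int(time.time())
    return datetime.fromtimestamp(seconds, tz=timezone.utc)
```

Passing the environment as a parameter is a design choice made here only to keep the sketch easy to test; calling `build_timestamp()` with no arguments reads the real process environment.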
In diffoscope development this month, versions 244, 245 and 246 were uploaded to Debian unstable by Chris Lamb, who also made the following changes:
- Don't include the file size in image metadata. It is, at best, distracting, and it is already in the directory metadata.
- Add compatibility with libarchive-5.
- Mark that the test_dex::test_javap_14_differences test requires the procyon tool.
- Initial work on DOS/MBR extraction.
- Move to using assert_diff in the .ico and .jpeg tests.
- Temporarily mark some Android-related tests as XFAIL due to Debian bugs #1040941 & #1040916.
- Fix the test skipped reason generation in the case of a version outside of the required range.
- Update copyright years.
- Fix try.diffoscope.org.
In addition, Gianfranco Costamagna added support for LLVM version 16.
Testing framework
The Reproducible Builds project operates a comprehensive testing framework (available at tests.reproducible-builds.org) in order to check packages and other artifacts for reproducibility. In July, a number of changes were made by Holger Levsen:
- General changes:
  - Upgrade Jenkins host to Debian bookworm now that Debian 12.1 is out.
  - djm: improve UX when rebooting a node fails.
  - djm: reduce wait time between rebooting nodes.
- Debian-related changes:
  - Various refactoring of the Debian scheduler.
  - Make Debian live builds more robust with respect to salsa.debian.org returning HTTP 502 errors.
  - Use the legacy SCP protocol instead of the SFTP protocol when transferring Debian live builds.
  - Speed up a number of database queries (thanks, Myon!).
  - Split create_meta_pkg_sets job into two (for Debian unstable and Debian testing) to halve the job runtime to approximately 90 minutes.
  - Split scheduler job into four separate jobs, one for each tested architecture.
  - Treat more PostgreSQL errors as serious (for some jobs).
  - Re-enable automatic database documentation now that postgresql_autodoc is back in Debian bookworm.
  - Remove various hardcoding of Debian release names.
  - Drop some i386 special casing.
- Other distributions:
  - Speed up Alpine SQL queries.
  - Adjust CSS layout for Arch Linux pages to match 3 and not 4 repos being tested.
  - Drop the community Arch Linux repo as it has now been merged into the extra repo.
  - Speed up a number of Arch-related database queries.
  - Try harder to properly clean up after building OpenWrt packages.
  - Drop all kfreebsd-related tests now that it's officially dead.
- System health:
  - Always ignore some well-known harmless orphan processes.
  - Detect another case of job failure due to Jenkins shutdown.
  - Show all non-co-installable package sets on the status page.
  - Warn that some specific reboot nodes are currently false-positives.
- Node health checks:
  - Run system and node health checks for Jenkins less frequently.
  - Try to restart any failed dpkg-db-backup and munin-node services.
In addition, Vagrant Cascadian updated the paths in our automated tests to use the same paths used by the official Debian build servers.
If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. However, you can get in touch with us via: